Overview

Dataset statistics

Number of variables: 14
Number of observations: 506
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 55.5 KiB
Average record size in memory: 112.3 B
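
The two memory figures are related by a simple division. A quick check, assuming 1 KiB = 1024 bytes and using the rounded values reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 55.5
n_observations = 506

# Average record size = total bytes / number of rows (1 KiB = 1024 B).
avg_record_size_b = total_size_kib * 1024 / n_observations
print(f"{avg_record_size_b:.1f} B")  # prints "112.3 B"
```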

Variable types

Numeric: 13
Categorical: 1

Alerts

CHAS has constant value "0.0" (Constant)
CRIM is highly correlated with ZN and 8 other fields (High correlation)
ZN is highly correlated with CRIM and 4 other fields (High correlation)
INDUS is highly correlated with CRIM and 7 other fields (High correlation)
NOX is highly correlated with CRIM and 8 other fields (High correlation)
RM is highly correlated with LSTAT and 1 other field (High correlation)
AGE is highly correlated with CRIM and 7 other fields (High correlation)
DIS is highly correlated with CRIM and 6 other fields (High correlation)
RAD is highly correlated with CRIM and 2 other fields (High correlation)
TAX is highly correlated with CRIM and 7 other fields (High correlation)
PTRATIO is highly correlated with prices (High correlation)
LSTAT is highly correlated with CRIM and 7 other fields (High correlation)
prices is highly correlated with CRIM and 7 other fields (High correlation)
CRIM is highly correlated with INDUS and 7 other fields (High correlation)
ZN is highly correlated with INDUS and 3 other fields (High correlation)
INDUS is highly correlated with CRIM and 8 other fields (High correlation)
NOX is highly correlated with CRIM and 8 other fields (High correlation)
RM is highly correlated with LSTAT and 1 other field (High correlation)
AGE is highly correlated with CRIM and 6 other fields (High correlation)
DIS is highly correlated with CRIM and 7 other fields (High correlation)
RAD is highly correlated with CRIM and 4 other fields (High correlation)
TAX is highly correlated with CRIM and 7 other fields (High correlation)
PTRATIO is highly correlated with prices (High correlation)
LSTAT is highly correlated with CRIM and 7 other fields (High correlation)
prices is highly correlated with CRIM and 6 other fields (High correlation)
CRIM is highly correlated with INDUS and 5 other fields (High correlation)
ZN is highly correlated with INDUS and 1 other field (High correlation)
INDUS is highly correlated with CRIM and 3 other fields (High correlation)
NOX is highly correlated with CRIM and 4 other fields (High correlation)
AGE is highly correlated with CRIM and 2 other fields (High correlation)
DIS is highly correlated with CRIM and 3 other fields (High correlation)
RAD is highly correlated with CRIM and 1 other field (High correlation)
TAX is highly correlated with CRIM and 1 other field (High correlation)
LSTAT is highly correlated with prices (High correlation)
prices is highly correlated with LSTAT (High correlation)
CRIM is highly correlated with INDUS and 9 other fields (High correlation)
ZN is highly correlated with INDUS and 6 other fields (High correlation)
INDUS is highly correlated with CRIM and 8 other fields (High correlation)
NOX is highly correlated with CRIM and 9 other fields (High correlation)
RM is highly correlated with PTRATIO and 2 other fields (High correlation)
AGE is highly correlated with CRIM and 8 other fields (High correlation)
DIS is highly correlated with CRIM and 9 other fields (High correlation)
RAD is highly correlated with CRIM and 9 other fields (High correlation)
TAX is highly correlated with CRIM and 6 other fields (High correlation)
PTRATIO is highly correlated with CRIM and 10 other fields (High correlation)
B is highly correlated with CRIM (High correlation)
LSTAT is highly correlated with CRIM and 7 other fields (High correlation)
prices is highly correlated with CRIM and 8 other fields (High correlation)
ZN has 372 (73.5%) zeros (Zeros)

Reproduction

Analysis started: 2022-07-19 08:43:25.813355
Analysis finished: 2022-07-19 08:43:59.372099
Duration: 33.56 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
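
The reported duration follows directly from the two timestamps. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2022-07-19 08:43:25.813355")
finished = datetime.fromisoformat("2022-07-19 08:43:59.372099")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "33.56 seconds"
```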

Variables

CRIM
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 440
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.225032011
Minimum: 0.00632
Maximum: 9.06963875
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.00632
5-th percentile: 0.02791
Q1: 0.082045
Median: 0.25651
Q3: 3.6770825
95-th percentile: 9.06963875
Maximum: 9.06963875
Range: 9.06331875
Interquartile range (IQR): 3.5950375

Descriptive statistics

Standard deviation: 3.313353374
Coefficient of variation (CV): 1.48912616
Kurtosis: -0.05011520597
Mean: 2.225032011
Median Absolute Deviation (MAD): 0.22145
Skewness: 1.282312917
Sum: 1125.866197
Variance: 10.97831058
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
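
Several of the CRIM figures above are derived from one another. The sketch below recomputes them from the reported values, using the usual textbook definitions (the variable names are illustrative). It also shows that the reported maximum coincides with Q3 + 1.5 × IQR, Tukey's upper fence. That would be consistent with outliers having been capped before profiling, but the report does not say so, so treat it as an observation only.

```python
import math

# Values copied from the CRIM summary above.
n = 506
distinct = 440
mean = 2.225032011
std = 3.313353374
q1, q3 = 0.082045, 3.6770825
minimum, maximum = 0.00632, 9.06963875

iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
total = mean * n                   # sum of all values
distinct_pct = 100 * distinct / n  # share of distinct values
upper_fence = q3 + 1.5 * iqr       # Tukey's upper fence

print(f"IQR={iqr:.7f} range={value_range:.8f} CV={cv:.6f}")
print(f"variance={variance:.5f} sum={total:.4f} distinct={distinct_pct:.1f}%")
print("max equals upper fence:", math.isclose(upper_fence, maximum, abs_tol=1e-9))
```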
Value       Count  Frequency (%)
9.06963875  66     13.0%
0.01501     2      0.4%
0.00632     1      0.2%
0.04297     1      0.2%
0.05561     1      0.2%
0.06466     1      0.2%
0.14103     1      0.2%
0.05372     1      0.2%
0.12932     1      0.2%
0.08199     1      0.2%
Other values (430)  430  85.0%

Value       Count  Frequency (%)
0.00632     1      0.2%
0.00906     1      0.2%
0.01096     1      0.2%
0.01301     1      0.2%
0.01311     1      0.2%
0.0136      1      0.2%
0.01381     1      0.2%
0.01432     1      0.2%
0.01439     1      0.2%
0.01501     2      0.4%

Value       Count  Frequency (%)
9.06963875  66     13.0%
8.98296     1      0.2%
8.79212     1      0.2%
8.71675     1      0.2%
8.64476     1      0.2%
8.49213     1      0.2%
8.26725     1      0.2%
8.24809     1      0.2%
8.20058     1      0.2%
8.15174     1      0.2%

ZN
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.963438735
Minimum: 0
Maximum: 31.25
Zeros: 372
Zeros (%): 73.5%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 12.5
95-th percentile: 31.25
Maximum: 31.25
Range: 31.25
Interquartile range (IQR): 12.5

Descriptive statistics

Standard deviation: 12.02878755
Coefficient of variation (CV): 1.727420605
Kurtosis: -0.2251670314
Mean: 6.963438735
Median Absolute Deviation (MAD): 0
Skewness: 1.261339549
Sum: 3523.5
Variance: 144.6917299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value       Count  Frequency (%)
0           372    73.5%
31.25       68     13.4%
20          21     4.2%
12.5        10     2.0%
25          10     2.0%
22          10     2.0%
30          6      1.2%
21          4      0.8%
28          3      0.6%
18          1      0.2%

Value       Count  Frequency (%)
0           372    73.5%
12.5        10     2.0%
17.5        1      0.2%
18          1      0.2%
20          21     4.2%
21          4      0.8%
22          10     2.0%
25          10     2.0%
28          3      0.6%
30          6      1.2%

Value       Count  Frequency (%)
31.25       68     13.4%
30          6      1.2%
28          3      0.6%
25          10     2.0%
22          10     2.0%
21          4      0.8%
20          21     4.2%
18          1      0.2%
17.5        1      0.2%
12.5        10     2.0%

INDUS
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 76
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.13677866
Minimum: 0.46
Maximum: 27.74
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.46
5-th percentile: 2.18
Q1: 5.19
Median: 9.69
Q3: 18.1
95-th percentile: 21.89
Maximum: 27.74
Range: 27.28
Interquartile range (IQR): 12.91

Descriptive statistics

Standard deviation: 6.860352941
Coefficient of variation (CV): 0.6160087358
Kurtosis: -1.233539601
Mean: 11.13677866
Median Absolute Deviation (MAD): 6.32
Skewness: 0.2950215679
Sum: 5635.21
Variance: 47.06444247
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
18.1        132    26.1%
19.58       30     5.9%
8.14        22     4.3%
6.2         18     3.6%
21.89       15     3.0%
3.97        12     2.4%
9.9         12     2.4%
8.56        11     2.2%
10.59       11     2.2%
5.86        10     2.0%
Other values (66)  233  46.0%

Value       Count  Frequency (%)
0.46        1      0.2%
0.74        1      0.2%
1.21        1      0.2%
1.22        1      0.2%
1.25        2      0.4%
1.32        1      0.2%
1.38        1      0.2%
1.47        2      0.4%
1.52        4      0.8%
1.69        2      0.4%

Value       Count  Frequency (%)
27.74       5      1.0%
25.65       7      1.4%
21.89       15     3.0%
19.58       30     5.9%
18.1        132    26.1%
15.04       3      0.6%
13.92       5      1.0%
13.89       4      0.8%
12.83       6      1.2%
11.93       5      1.0%

CHAS
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB
0.0: 506

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1518
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value       Count  Frequency (%)
0.0         506    100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count  Frequency (%)
0.0         506    100.0%

Most occurring characters

Value       Count  Frequency (%)
0           1012   66.7%
.           506    33.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1012   66.7%
Other Punctuation  506    33.3%

Most frequent character per category

Decimal Number
Value       Count  Frequency (%)
0           1012   100.0%
Other Punctuation
Value       Count  Frequency (%)
.           506    100.0%

Most occurring scripts

Value       Count  Frequency (%)
Common      1518   100.0%

Most frequent character per script

Common
Value       Count  Frequency (%)
0           1012   66.7%
.           506    33.3%

Most occurring blocks

Value       Count  Frequency (%)
ASCII       1518   100.0%

Most frequent character per block

ASCII
Value       Count  Frequency (%)
0           1012   66.7%
.           506    33.3%

NOX
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 81
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5546950593
Minimum: 0.385
Maximum: 0.871
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.385
5-th percentile: 0.40925
Q1: 0.449
Median: 0.538
Q3: 0.624
95-th percentile: 0.74
Maximum: 0.871
Range: 0.486
Interquartile range (IQR): 0.175

Descriptive statistics

Standard deviation: 0.1158776757
Coefficient of variation (CV): 0.2089033853
Kurtosis: -0.06466713337
Mean: 0.5546950593
Median Absolute Deviation (MAD): 0.0875
Skewness: 0.7293079225
Sum: 280.6757
Variance: 0.01342763572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
0.538       23     4.5%
0.713       18     3.6%
0.437       17     3.4%
0.871       16     3.2%
0.624       15     3.0%
0.489       15     3.0%
0.693       14     2.8%
0.605       14     2.8%
0.74        13     2.6%
0.544       12     2.4%
Other values (71)  349  69.0%

Value       Count  Frequency (%)
0.385       1      0.2%
0.389       1      0.2%
0.392       2      0.4%
0.394       1      0.2%
0.398       2      0.4%
0.4         4      0.8%
0.401       3      0.6%
0.403       3      0.6%
0.404       3      0.6%
0.405       3      0.6%

Value       Count  Frequency (%)
0.871       16     3.2%
0.77        8      1.6%
0.74        13     2.6%
0.718       6      1.2%
0.713       18     3.6%
0.7         11     2.2%
0.693       14     2.8%
0.679       8      1.6%
0.671       7      1.4%
0.668       3      0.6%

RM
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 420
Distinct (%): 83.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.275312253
Minimum: 4.7785
Maximum: 7.7305
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 4.7785
5-th percentile: 5.314
Q1: 5.8855
Median: 6.2085
Q3: 6.6235
95-th percentile: 7.5875
Maximum: 7.7305
Range: 2.952
Interquartile range (IQR): 0.738

Descriptive statistics

Standard deviation: 0.6302423786
Coefficient of variation (CV): 0.1004320348
Kurtosis: 0.2326437473
Mean: 6.275312253
Median Absolute Deviation (MAD): 0.3455
Skewness: 0.2966398966
Sum: 3175.308
Variance: 0.3972054558
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
7.7305      22     4.3%
4.7785      8      1.6%
6.167       3      0.6%
6.405       3      0.6%
6.127       3      0.6%
6.229       3      0.6%
6.417       3      0.6%
5.713       3      0.6%
6.144       2      0.4%
6.193       2      0.4%
Other values (410)  454  89.7%

Value       Count  Frequency (%)
4.7785      8      1.6%
4.88        1      0.2%
4.903       1      0.2%
4.906       1      0.2%
4.926       1      0.2%
4.963       1      0.2%
4.97        1      0.2%
4.973       1      0.2%
5           1      0.2%
5.012       1      0.2%

Value       Count  Frequency (%)
7.7305      22     4.3%
7.691       1      0.2%
7.686       1      0.2%
7.645       1      0.2%
7.61        1      0.2%
7.52        1      0.2%
7.489       1      0.2%
7.47        1      0.2%
7.454       1      0.2%
7.42        1      0.2%

AGE
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 356
Distinct (%): 70.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.57490119
Minimum: 2.9
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 2.9
5-th percentile: 17.725
Q1: 45.025
Median: 77.5
Q3: 94.075
95-th percentile: 100
Maximum: 100
Range: 97.1
Interquartile range (IQR): 49.05

Descriptive statistics

Standard deviation: 28.14886141
Coefficient of variation (CV): 0.410483441
Kurtosis: -0.9677155942
Mean: 68.57490119
Median Absolute Deviation (MAD): 19.55
Skewness: -0.5989626399
Sum: 34698.9
Variance: 792.3583985
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
100         43     8.5%
95.4        4      0.8%
96          4      0.8%
98.2        4      0.8%
97.9        4      0.8%
98.8        4      0.8%
87.9        4      0.8%
95.6        3      0.6%
97          3      0.6%
21.4        3      0.6%
Other values (346)  430  85.0%

Value       Count  Frequency (%)
2.9         1      0.2%
6           1      0.2%
6.2         1      0.2%
6.5         1      0.2%
6.6         2      0.4%
6.8         1      0.2%
7.8         2      0.4%
8.4         1      0.2%
8.9         1      0.2%
9.8         1      0.2%

Value       Count  Frequency (%)
100         43     8.5%
99.3        1      0.2%
99.1        1      0.2%
98.9        3      0.6%
98.8        4      0.8%
98.7        1      0.2%
98.5        1      0.2%
98.4        2      0.4%
98.3        2      0.4%
98.2        4      0.8%

DIS
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 410
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.783946838
Minimum: 1.1296
Maximum: 9.8208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1.1296
5-th percentile: 1.461975
Q1: 2.100175
Median: 3.20745
Q3: 5.188425
95-th percentile: 7.8278
Maximum: 9.8208
Range: 8.6912
Interquartile range (IQR): 3.08825

Descriptive statistics

Standard deviation: 2.069765042
Coefficient of variation (CV): 0.5469857613
Kurtosis: -0.01272349261
Mean: 3.783946838
Median Absolute Deviation (MAD): 1.29115
Skewness: 0.908466509
Sum: 1914.6771
Variance: 4.283927328
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
3.4952      5      1.0%
9.8208      5      1.0%
5.7209      4      0.8%
5.2873      4      0.8%
5.4007      4      0.8%
6.8147      4      0.8%
3.6519      3      0.6%
6.3361      3      0.6%
6.498       3      0.6%
5.4159      3      0.6%
Other values (400)  468  92.5%

Value       Count  Frequency (%)
1.1296      1      0.2%
1.137       1      0.2%
1.1691      1      0.2%
1.1742      1      0.2%
1.1781      1      0.2%
1.2024      1      0.2%
1.2852      1      0.2%
1.3163      1      0.2%
1.3216      1      0.2%
1.3325      1      0.2%

Value       Count  Frequency (%)
9.8208      5      1.0%
9.2229      1      0.2%
9.2203      2      0.4%
9.1876      1      0.2%
9.0892      1      0.2%
8.9067      2      0.4%
8.7921      2      0.4%
8.6966      1      0.2%
8.5353      1      0.2%
8.344       1      0.2%

RAD
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.549407115
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 5
Q3: 24
95-th percentile: 24
Maximum: 24
Range: 23
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 8.707259384
Coefficient of variation (CV): 0.9118115166
Kurtosis: -0.8672319936
Mean: 9.549407115
Median Absolute Deviation (MAD): 2
Skewness: 1.004814648
Sum: 4832
Variance: 75.81636598
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value       Count  Frequency (%)
24          132    26.1%
5           115    22.7%
4           110    21.7%
3           38     7.5%
6           26     5.1%
2           24     4.7%
8           24     4.7%
1           20     4.0%
7           17     3.4%

Value       Count  Frequency (%)
1           20     4.0%
2           24     4.7%
3           38     7.5%
4           110    21.7%
5           115    22.7%
6           26     5.1%
7           17     3.4%
8           24     4.7%
24          132    26.1%

Value       Count  Frequency (%)
24          132    26.1%
8           24     4.7%
7           17     3.4%
6           26     5.1%
5           115    22.7%
4           110    21.7%
3           38     7.5%
2           24     4.7%
1           20     4.0%

TAX
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 66
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 408.2371542
Minimum: 187
Maximum: 711
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 187
5-th percentile: 222
Q1: 279
Median: 330
Q3: 666
95-th percentile: 666
Maximum: 711
Range: 524
Interquartile range (IQR): 387

Descriptive statistics

Standard deviation: 168.5371161
Coefficient of variation (CV): 0.4128411987
Kurtosis: -1.142407992
Mean: 408.2371542
Median Absolute Deviation (MAD): 73
Skewness: 0.6699559418
Sum: 206568
Variance: 28404.75949
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
666         132    26.1%
307         40     7.9%
403         30     5.9%
437         15     3.0%
304         14     2.8%
264         12     2.4%
398         12     2.4%
384         11     2.2%
277         11     2.2%
224         10     2.0%
Other values (56)  219  43.3%

Value       Count  Frequency (%)
187         1      0.2%
188         7      1.4%
193         8      1.6%
198         1      0.2%
216         5      1.0%
222         7      1.4%
223         5      1.0%
224         10     2.0%
226         1      0.2%
233         9      1.8%

Value       Count  Frequency (%)
711         5      1.0%
666         132    26.1%
469         1      0.2%
437         15     3.0%
432         9      1.8%
430         3      0.6%
422         1      0.2%
411         2      0.4%
403         30     5.9%
402         2      0.4%

PTRATIO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 45
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.46383399
Minimum: 13.2
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 13.2
5-th percentile: 14.7
Q1: 17.4
Median: 19.05
Q3: 20.2
95-th percentile: 21
Maximum: 22
Range: 8.8
Interquartile range (IQR): 2.8

Descriptive statistics

Standard deviation: 2.143924486
Coefficient of variation (CV): 0.1161148051
Kurtosis: -0.4218644734
Mean: 18.46383399
Median Absolute Deviation (MAD): 1.15
Skewness: -0.7624951948
Sum: 9342.7
Variance: 4.596412202
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
Value       Count  Frequency (%)
20.2        140    27.7%
14.7        34     6.7%
21          27     5.3%
17.8        23     4.5%
19.2        19     3.8%
17.4        18     3.6%
19.1        17     3.4%
18.6        17     3.4%
16.6        16     3.2%
18.4        16     3.2%
Other values (35)  179  35.4%

Value       Count  Frequency (%)
13.2        15     3.0%
13.6        1      0.2%
14.4        1      0.2%
14.7        34     6.7%
14.8        3      0.6%
14.9        4      0.8%
15.1        1      0.2%
15.2        13     2.6%
15.3        3      0.6%
15.5        1      0.2%

Value       Count  Frequency (%)
22          2      0.4%
21.2        15     3.0%
21.1        1      0.2%
21          27     5.3%
20.9        11     2.2%
20.2        140    27.7%
20.1        5      1.0%
19.7        8      1.6%
19.6        8      1.6%
19.2        19     3.8%

B
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 282
Distinct (%): 55.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 381.9188365
Minimum: 344.10625
Maximum: 396.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 344.10625
5-th percentile: 344.10625
Q1: 375.3775
Median: 391.44
Q3: 396.225
95-th percentile: 396.9
Maximum: 396.9
Range: 52.79375
Interquartile range (IQR): 20.8475

Descriptive statistics

Standard deviation: 19.05491282
Coefficient of variation (CV): 0.04989257142
Kurtosis: -0.2305895383
Mean: 381.9188365
Median Absolute Deviation (MAD): 5.46
Skewness: -1.164207567
Sum: 193250.9312
Variance: 363.0897027
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
396.9       121    23.9%
344.10625   77     15.2%
395.24      3      0.6%
393.74      3      0.6%
396.21      2      0.4%
395.63      2      0.4%
392.78      2      0.4%
395.56      2      0.4%
393.37      2      0.4%
395.11      2      0.4%
Other values (272)  290  57.3%

Value       Count  Frequency (%)
344.10625   77     15.2%
344.91      1      0.2%
347.88      1      0.2%
348.13      1      0.2%
348.93      1      0.2%
349.48      1      0.2%
350.45      1      0.2%
350.65      1      0.2%
351.85      1      0.2%
352.58      1      0.2%

Value       Count  Frequency (%)
396.9       121    23.9%
396.42      1      0.2%
396.33      1      0.2%
396.3       1      0.2%
396.28      1      0.2%
396.24      1      0.2%
396.23      1      0.2%
396.21      2      0.4%
396.14      1      0.2%
396.06      2      0.4%

LSTAT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 449
Distinct (%): 88.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.61201087
Minimum: 1.73
Maximum: 31.9625
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1.73
5-th percentile: 3.7075
Q1: 6.95
Median: 11.36
Q3: 16.955
95-th percentile: 26.8075
Maximum: 31.9625
Range: 30.2325
Interquartile range (IQR): 10.005

Descriptive statistics

Standard deviation: 7.016828773
Coefficient of variation (CV): 0.5563608251
Kurtosis: 0.08802788778
Mean: 12.61201087
Median Absolute Deviation (MAD): 4.795
Skewness: 0.8086712082
Sum: 6381.6775
Variance: 49.23588604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
31.9625     7      1.4%
6.36        3      0.6%
7.79        3      0.6%
18.13       3      0.6%
14.1        3      0.6%
8.05        3      0.6%
9.5         2      0.4%
8.1         2      0.4%
7.6         2      0.4%
13          2      0.4%
Other values (439)  476  94.1%

Value       Count  Frequency (%)
1.73        1      0.2%
1.92        1      0.2%
1.98        1      0.2%
2.47        1      0.2%
2.87        1      0.2%
2.88        1      0.2%
2.94        1      0.2%
2.96        1      0.2%
2.97        1      0.2%
2.98        1      0.2%

Value       Count  Frequency (%)
31.9625     7      1.4%
30.81       2      0.4%
30.63       1      0.2%
30.62       1      0.2%
30.59       1      0.2%
29.97       1      0.2%
29.93       1      0.2%
29.68       1      0.2%
29.55       1      0.2%
29.53       1      0.2%

prices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 207
Distinct (%): 40.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.8770751
Minimum: 5.0625
Maximum: 36.9625
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 5.0625
5-th percentile: 10.2
Q1: 17.025
Median: 21.2
Q3: 25
95-th percentile: 36.9625
Maximum: 36.9625
Range: 31.9
Interquartile range (IQR): 7.975

Descriptive statistics

Standard deviation: 7.602975619
Coefficient of variation (CV): 0.3475316323
Kurtosis: -0.3344361402
Mean: 21.8770751
Median Absolute Deviation (MAD): 4
Skewness: 0.3536137042
Sum: 11069.8
Variance: 57.80523826
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value       Count  Frequency (%)
36.9625     38     7.5%
25          8      1.6%
23.1        7      1.4%
22          7      1.4%
21.7        7      1.4%
20.6        6      1.2%
19.4        6      1.2%
20.1        5      1.0%
15.6        5      1.0%
22.6        5      1.0%
Other values (197)  412  81.4%

Value       Count  Frequency (%)
5.0625      2      0.4%
5.6         1      0.2%
6.3         1      0.2%
7           2      0.4%
7.2         3      0.6%
7.4         1      0.2%
7.5         1      0.2%
8.1         1      0.2%
8.3         2      0.4%
8.4         2      0.4%

Value       Count  Frequency (%)
36.9625     38     7.5%
36.5        1      0.2%
36.4        1      0.2%
36.2        2      0.4%
36.1        1      0.2%
36          1      0.2%
35.4        2      0.4%
35.2        1      0.2%
35.1        1      0.2%
34.9        3      0.6%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
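
The calculation described above can be sketched with the standard library alone: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their ranks is an assumption about tie handling that the report does not state. The data and the helper names `average_ranks`, `pearson`, and `spearman` are illustrative.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: the Pearson formula applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is nonlinear but strictly increasing.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("rho:", spearman(x, y))
print("r:  ", pearson(x, y))
```

On this toy data ρ comes out at 1 (up to floating-point rounding) because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.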

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables (a negative scale factor only flips its sign), implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
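
A minimal standard-library sketch of this formula, which also illustrates the invariance under separate shifts and positive rescalings of each variable. The data and the function name `pearson_r` are illustrative.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sd_x = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sd_y = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sd_x * sd_y)

# Toy data with a nearly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately, using positive scale factors.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```

The two printed values agree up to floating-point rounding.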

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
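
The pair-counting definition can be written out directly. The sketch below implements exactly the formula described here, often called tau-a; libraries commonly report a tie-adjusted variant instead, so values may differ when the data contain ties. The data and the function name `kendall_tau_a` are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 means a tie in x or y; it counts toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
print(kendall_tau_a(x, y))  # prints 0.4
```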

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

     CRIM     ZN    INDUS  CHAS  NOX    RM     AGE    DIS     RAD  TAX    PTRATIO  B       LSTAT  prices
0    0.00632  18.0  2.31   0.0   0.538  6.575  65.2   4.0900  1.0  296.0  15.3     396.90  4.98   24.0
1    0.02731  0.0   7.07   0.0   0.469  6.421  78.9   4.9671  2.0  242.0  17.8     396.90  9.14   21.6
2    0.02729  0.0   7.07   0.0   0.469  7.185  61.1   4.9671  2.0  242.0  17.8     392.83  4.03   34.7
3    0.03237  0.0   2.18   0.0   0.458  6.998  45.8   6.0622  3.0  222.0  18.7     394.63  2.94   33.4
4    0.06905  0.0   2.18   0.0   0.458  7.147  54.2   6.0622  3.0  222.0  18.7     396.90  5.33   36.2
5    0.02985  0.0   2.18   0.0   0.458  6.430  58.7   6.0622  3.0  222.0  18.7     394.12  5.21   28.7
6    0.08829  12.5  7.87   0.0   0.524  6.012  66.6   5.5605  5.0  311.0  15.2     395.60  12.43  22.9
7    0.14455  12.5  7.87   0.0   0.524  6.172  96.1   5.9505  5.0  311.0  15.2     396.90  19.15  27.1
8    0.21124  12.5  7.87   0.0   0.524  5.631  100.0  6.0821  5.0  311.0  15.2     386.63  29.93  16.5
9    0.17004  12.5  7.87   0.0   0.524  6.004  85.9   6.5921  5.0  311.0  15.2     386.71  17.10  18.9

Last rows

     CRIM     ZN    INDUS  CHAS  NOX    RM     AGE    DIS     RAD  TAX    PTRATIO  B       LSTAT  prices
496  0.28960  0.0   9.69   0.0   0.585  5.390  72.9   2.7986  6.0  391.0  19.2     396.90  21.14  19.7
497  0.26838  0.0   9.69   0.0   0.585  5.794  70.6   2.8927  6.0  391.0  19.2     396.90  14.10  18.3
498  0.23912  0.0   9.69   0.0   0.585  6.019  65.3   2.4091  6.0  391.0  19.2     396.90  12.92  21.2
499  0.17783  0.0   9.69   0.0   0.585  5.569  73.5   2.3999  6.0  391.0  19.2     395.77  15.10  17.5
500  0.22438  0.0   9.69   0.0   0.585  6.027  79.7   2.4982  6.0  391.0  19.2     396.90  14.33  16.8
501  0.06263  0.0   11.93  0.0   0.573  6.593  69.1   2.4786  1.0  273.0  21.0     391.99  9.67   22.4
502  0.04527  0.0   11.93  0.0   0.573  6.120  76.7   2.2875  1.0  273.0  21.0     396.90  9.08   20.6
503  0.06076  0.0   11.93  0.0   0.573  6.976  91.0   2.1675  1.0  273.0  21.0     396.90  5.64   23.9
504  0.10959  0.0   11.93  0.0   0.573  6.794  89.3   2.3889  1.0  273.0  21.0     393.45  6.48   22.0
505  0.04741  0.0   11.93  0.0   0.573  6.030  80.8   2.5050  1.0  273.0  21.0     396.90  7.88   11.9